1. Introduction


This  is  a  report  of  research  which  has  been  conducted 
with  the  Directorate  of  Information  Sciences  of  the  Air 
Force  Office  of  Scientific  Research  on  the  Grammatical  Analysis 
of Spoken Language. This research project began in October,
1962, under Grant AF-AFOSR-22-63. Beginning in February,
1964, the research was continued under Grant AF-AFOSR-02-64.
The research under this latter grant was carried on through
January, 1965, and was subsequently extended through January,
1966.


The basic objective of the project has been to contribute
to an increased understanding of the grammar of
spoken  language.  The  machine  translation  of  languages  is 
the  chief  applied  motivation  for  much  of  the  formal  work 
that  has  been  done  on  the  grammar  of  natural  language  during 
recent  years.  Machine  translation  has  thus  far  been  primarily 
concerned  with  printed  language.  Automatic  speech  recognition, 
however,  requires  a  syntax  for  spoken  language,  which  may 
differ  in  several  important  aspects  from  the  syntax  of 
printed  language. 


Certain of the distinctions between spoken and written
language are relatively obvious. For example, printed
language usually consists of linear strings of discrete
symbols, whereas spoken language consists of continuous
acoustic waves. Since none of the major aspects of printed
or of spoken language are fully understood, however, it is
not surprising that it is exceedingly difficult to formalize
the distinctions between these two forms of natural language.

2.  Research  Program 

In spoken language several different types of information
are transmitted concurrently. Information is transmitted
by the phonemic sequences and simultaneously by the
associated prosodies. In spoken language there is a great
variation in syntax, from single lexical items to long and
involved expressions in which the relationships among the
lexical items may be obscure. In spoken language the mood
of the speaker, e.g. his attitude toward what he is saying,
may vary from one utterance to another, and these differences
are often expressed in very subtle ways in the prosodies
of the dialect. It was not possible to investigate such
distinctions within the scope of the present project, but it
has been possible to identify a number of the basic problems
which must be resolved before a convincing grammar of spoken
language can be constructed. The research which has been
conducted  is  divided  into  four  major  parts,  as  discussed 
below. 

A.  Prosodic  Analysis.  Techniques  for  research  on 
spoken  language  are  somewhat  restricted,  and  each  has 
major limitations. There are two primary techniques which
have been employed in the past. The first technique is that
of  speech  analysis  and  uses  utterances  from  natural  speech 
as  test  signals.  We  are  at  present  very  far  from  being 
able  to  make  adequate  instrumental  interpretations  of 
utterances  as  they  occur  in  natural  conversational  speech. 

As  a  result,  it  is  customary  to  conduct  tests  with  utterances 
which  differ  in  some  minimal  linguistic  way,  and  then  to 
study  the  physiological  or  the  acoustical  bases  for  these 
differences.  While  much  important  information  has  been 
obtained  by  this  technique,  it  is  exceedingly  difficult  to 
obtain  an  integrated  understanding  of  spoken  language  from 
the  accumulation  of  such  information. 

The  second  major  technique  is  that  of  speech 
synthesis.  In  speech  synthesis  it  is  possible  to  control 
the  analog  of  physiological  variables  or  to  control  the 
acoustical  variables  in  a  systematic  way.  The  resulting 
utterances  may  then  be  subjected  to  listening  tests  for 
interpretation.  Speech  synthesis  has  the  very  great 
advantage  that  the  variables  can  be  controlled  systematically 
and  according  to  specification.  It  has  the  disadvantage 
that  there  is  no  assurance  that  the  synthesizer  generates 
the  most  relevant  physiological  or  acoustical  parameters. 

A  third  technique  was  developed  under  the  present 
project  for  research  on  spoken  language,  particularly  the 
prosodies. This technique employs selective distortion of
the speech signal so that certain types of information are
obliterated. With this technique, the phonemic information
may be reduced effectively to zero, and only prosodic
information retained. In addition, fundamental voice frequency
may optionally be included or excluded. Essentially, the
experimental system flattens the power spectrum without
changing the instantaneous power, and then optionally
reintroduces harmonics of the fundamental frequency. Thus
the acoustic prosodic parameters of average fundamental
voice frequency, average speech power, and acoustic phonetic
duration may be preserved. The phonetic quality of vowels
and consonants, including information about the secondary
phonetic parameters, is destroyed. A simple extension of
the technique reported would also make it possible optionally
to include or exclude variations in average speech power.
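
The abstract in Section 4 states that the spectrum is flattened by
multiplying the input by a random telegraph wave. The following is a
minimal discrete-time sketch of that one step, not a model of the
experimental system itself, which is not described in detail here. The
function names, the sampling rate, the flip probability, and the test
tone are all assumptions made for illustration, and the optional
reintroduction of harmonics of the fundamental frequency is not modeled.

```python
import math
import random


def random_telegraph_wave(n_samples, flip_probability, rng):
    """Return n_samples values of +1 or -1; the sign flips at random instants."""
    level = 1
    wave = []
    for _ in range(n_samples):
        # Assumed model: at each sample the sign flips with a fixed probability.
        if rng.random() < flip_probability:
            level = -level
        wave.append(level)
    return wave


def distort(signal, flip_probability=0.5, seed=0):
    """Multiply each sample by a random +1/-1 wave (hypothetical helper)."""
    rng = random.Random(seed)
    wave = random_telegraph_wave(len(signal), flip_probability, rng)
    return [sample * sign for sample, sign in zip(signal, wave)]


# A 200 Hz test tone, 0.1 s long, at an assumed 8000 Hz sampling rate.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(800)]
distorted = distort(tone)

# Each sample keeps its magnitude, so the instantaneous power is unchanged.
assert all(abs(d) == abs(t) for d, t in zip(distorted, tone))
```

Because every sample is multiplied by either +1 or -1, its square is
untouched, which corresponds to the statement that the instantaneous
power is not changed. With a flip probability of 0.5 the successive
signs are independent, so the output is uncorrelated from sample to
sample on average, which is what a flat power spectrum means in this
discrete setting.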

The  procedure  provides  a  means  of  investigating  the 
information contributed by the prosodies, either singly
or jointly. A system to perform the above indicated
distortions was constructed, and an experiment with the
system  was  conducted.  It  was  found  that  the  system  did, 
indeed,  obliterate  the  phonemic  information,  while  the 
prosodic  information  was  retained.  When  the  channel  for 
average  fundamental  voice  frequency  was  eliminated,  correct 
listener  responses  to  stress  on  English  words  decreased 
only  slightly  but  the  listener  responses  to  intonation 
approached the chance level.

It is well known that the contributions to speech
intelligibility of various frequency bands throughout the
frequency domain do not summate linearly. It seems
reasonable to assume that the contributions to utterance
intelligibility of the phoneme sequences and of the prosodies
singly and in combination also will not summate linearly.
While it was not possible to investigate this subject during
the course of the research, the technique described should
provide the basis for investigating this aspect of speech
intelligibility.

B. Lexical Units. In natural language the concept
of lexical unit must be interpreted in a very broad sense.
In the case of spoken language, the elemental meaningful
units are denoted both by phoneme sequences and by
prosodemes. The morpheme has been considered the basic
meaningful unit of grammar in this study. An attempt was
first made to specify an orthographic morpheme for a
graphemic system of writing. A language may be written
with graphemes, with a syllabary, or with ideographs, but
the specification of the orthographic morpheme which has
been developed is restricted to sequences of graphemes.
An initial formulation of the morpheme in spoken language
has also been constructed. The work which has been done is
only preliminary and is not at present ready for publication.
The principal investigator expects to continue work in this
area, however, particularly on the concept of the morpheme
in spoken language.

C.  Format  for  Syntactic  Description.  Because  of 
the tremendous complexity of natural language, the development
of a suitable format for expressing the structure of
the syntax of spoken language presents a major problem.
We have found previously that expressing the phonology of a
dialect entirely in the form of rules results in an
intricate and relatively obscure description. The format
can be made simpler, more convenient, and easier to
interpret by the use of a system of reference tables to
which more general rules refer.

It  seemed  reasonable  that  this  method  might  also 
be  of  value  at  the  syntactic  level.  Reference  tables  make 
it  possible  to  specify  allowable  and  excluded  sequences 
in  a  relatively  convenient  and  direct  manner.  Since  French 
was  the  native  language  of  the  investigator  for  this  part 
of  the  study,  the  generation  of  verbal  forms  in  French  was 
selected  as  the  topic.  A  generative  format  was  employed  and 
it  was  found  that  a  relatively  simple  set  of  rules  could  be 
employed  to  refer  to  a  set  of  tables  for  generating  the 
various  French  verbs.  There  are,  of  course,  many  irregular 
forms  and  many  forms  which  do  not  occur.  The  tables  make 
it  possible  to  express  all  of  these  various  conditions  in 
a  relatively  complete  and  compact  manner. 
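
The idea of a few general rules that refer to tables can be sketched
as follows. This is only a toy illustration and not the grammar
developed in the study: the table layout, the names `ENDINGS`, `VERBS`
and `generate`, and the restriction to three verbs in the present
indicative are all assumptions. A missing table entry stands for a form
that does not occur, so the rule produces nothing for it.

```python
from typing import Optional

# Person/number slots used as column headings (assumed labels).
SLOTS = ("1sg", "2sg", "3sg", "1pl", "2pl", "3pl")

# Ending paradigms: one row per paradigm, one column per slot.
ENDINGS = {
    "E1": dict(zip(SLOTS, ("e", "es", "e", "ons", "ez", "ent"))),
    "E2": dict(zip(SLOTS, ("s", "s", "t", "ons", "ez", "ent"))),
}

# Stem distribution: for each verb, the stem used in each slot.
# A slot that is absent is an empty table entry (the form does not occur).
VERBS = {
    "parler": {"paradigm": "E1", "stems": {slot: "parl" for slot in SLOTS}},
    "finir": {
        "paradigm": "E2",
        "stems": dict(zip(SLOTS, ("fini", "fini", "fini",
                                  "finiss", "finiss", "finiss"))),
    },
    # Impersonal verb: only the third person singular exists.
    "falloir": {"paradigm": "E2", "stems": {"3sg": "fau"}},
}


def generate(verb: str, slot: str) -> Optional[str]:
    """Single general rule: form = stem(verb, slot) + ending(paradigm, slot)."""
    entry = VERBS[verb]
    stem = entry["stems"].get(slot)
    if stem is None:
        # Empty table entry: the whole string is dropped, nothing is generated.
        return None
    return stem + ENDINGS[entry["paradigm"]][slot]


for verb in VERBS:
    print(verb, [generate(verb, slot) for slot in SLOTS])
```

The single rule in `generate` never mentions an individual verb. All
irregularity and all gaps live in the tables, which is the property the
text attributes to the tabular format.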


While the above indicated investigation was concerned
primarily with a very small part of the total syntax of the
language, it offers an approach which seems worthy of much
further consideration. Ultimately it should be possible to
formulate such grammatical descriptions in either an analytic
or a generative form, that is, either for the analysis of
utterances in a dialect or for generating utterances in a
dialect.

D. English Grammar. Most of the research which has
been done on automatic speech recognition has been concerned
with English. As a result, English grammar is of particular
interest to those working in the field of speech automation.
During the course of the present project consideration has
been given to the most appropriate form for a grammar for
automatic speech recognition. The construction of a total
grammar is a problem which obviously extends far beyond the
scope of the present study. As a result a relatively small
problem in English grammar was selected for investigation.
The problem was further simplified by disregarding the
phonological, particularly the prosodic, aspect of the
grammar. Thus it was formulated primarily in terms of
orthography.

Determiner phrases in English were chosen as the
subject of investigation. The study began with an attempt
to write a complete generative statement for a selected
set of determiner phrases. Many difficulties arose in
the formulation. It proved to be extremely difficult to
generate all reasonable forms and to exclude all unreasonable
forms. This circumstance led to a detailed consideration
of the place of semantics in syntactic descriptions and
of the properties required of a syntactic description. The
study has suggested that the syntactic part of a grammatical
description may be less important than the semantic part.
The study has emphasized the importance of finding a practical
and effective way to manage semantic data in grammatical
descriptions.

3.  Personnel 

The following students have been employed on a part-time
basis on the project with the Directorate of Information
Sciences during the past two-year period of the grant.

Andre-Pierre  Benguerel,  Communication  Sciences 

Ralph  H.  Fertig,  Mathematics 

John  R.  Hanne,  Communication  Sciences 

George L. Huttar, Linguistics

James  A.  Mason,  Communication  Sciences 

4. Publications

Most  of  the  areas  discussed  above  involve  basic 
problems  which  require  continued  investigation.  Within 
the  course  of  the  project,  however,  certain  particular 
studies were completed and were prepared for publication.
Manuscripts describing work carried out on the project
with the Information Sciences Directorate are as follows:

Michael H. O'Malley and Gordon E. Peterson,
An Experimental Method for Prosodic Analysis,
Phonetica (accepted for publication).

Some  of  the  more  difficult  questions  in  the  study 
of  language  involve  the  nature  and  function  of  the  prosodies. 
While  the  prosodies  have  been  investigated  by  observing 
their  acoustic  correlates  and  by  varying  the  relevant 
acoustical  parameters  in  synthetic  speech,  the  use  of 
distorted  natural  speech  also  provides  an  effective  procedure 
for  perceptual  studies.  In  this  study  a  technique  for 
reducing  a  speech  wave  to  the  acoustic-prosodic  parameters 
of  speech  power,  phonetic  duration,  and  fundamental  voice 
frequency  was  developed.  With  the  system  described,  all 
suprasegmental  information  based  on  the  prosodic  parameters 
is  transmitted  while  all  segmental  information  is  destroyed. 
The  technique  consists  of  multiplying  the  input  by  a  random 
telegraph  wave,  thus  flattening  the  power  spectrum.  Harmonics 
of the fundamental frequency are then optionally reintroduced
to provide fundamental frequency information. Listening
tests showed that phoneme intelligibility was almost eliminated
while intonation and stress were only slightly affected.
Furthermore,  eliminating  the  fundamental  voice  frequency 
caused  the  perception  of  intonation  to  approach  the  chance 
level  while  the  perception  of  stress  was  only  slightly 
affected.  The  technique  should  be  useful  for  investigating 
the  role  of  the  prosodies  in  grammatical  structures. 


Andre-Pierre Benguerel, Generation of Verbal
Forms in French, The International Journal of
American Linguistics (submitted for publication).


This  paper  presents  a  generative  grammar  of  French 
verbal  forms.  It  consists  of  an  ordered  set  of  rewrite 
rules  and  of  a  set  of  tables.  It  generates  all  existing 
verbal  forms  without  generating  any  nonexisting  ones.  To 
shorten  the  part  of  the  grammar  employing  rewrite  rules, 
symbols  with  indices  have  been  used.  The  tables  present 
stem  and  ending  distributions  in  matrix  form  and  the 
indices  of  the  complex  symbols  correspond  to  the  different 
row  and  column  headings  of  these  matrices.  If  the  values 
of  the  indices  are  chosen  in  such  a  way  as  to  correspond 
to a nonexisting form, i.e., to an empty entry in a matrix,
the whole string is deleted and no incorrect form can be
generated.

The verbs are distributed into classes, according
to their stem distribution, stem formation, and endings.
A good compromise is reached between a large number of
classes with few stems and a small number of classes with
many stems, a large number of which would often be identical.
The number of ending paradigms has also been kept at a
minimum. Nevertheless the use of more than one ending
paradigm has proven worthwhile in decreasing the number
of verb classes. The presentation of the material in
tabular form may appear to be lengthy, but actually it
makes possible the presentation of a large amount of
grammatical detail in a compact form. Although the total
number of rules (or of choices to be made) may be larger
than in a morphophonemic description, the description is
more exact and the average number of choices per production
is smaller. In other words, once we have selected a verb
class, the other 62 classes are excluded and the number of
choices that remain to be made is certainly smaller than
the number of exceptions that would have to be looked up
in a complete morphophonemic description.

The departure from an ordinary generative grammar
lies in the use of a tabular form for presenting the lexical
material. This return to a presentation often found in
traditional grammars has several advantages: 1) It makes
the grammar more readable to anyone who wants to follow
through the generation of a verb form. 2) It is readily
usable in a computer program. 3) It naturally complements
complex symbols, since row and column headings of the
matrices are actual realizations of the indices. 4) It can
be used as a teaching tool with little modification. 5) It is
probably closer to the intuition of the literate native
speaker than a system consisting purely of rewrite rules,
without any device such as complex symbols or tables.

James A. Mason and Gordon E. Peterson, On the
Problem of Describing the Grammar of Natural
Languages, Language and Speech (accepted for
publication).

Difficulties  encountered  in  an  attempt  to  describe 
the  syntax  of  English  determiner  phrases  resulted  in  a 
reconsideration of the purpose and organizational principles
of grammatical descriptions. Some illumination of the
nature of grammatical descriptions is obtained by a consideration
of systems of chess notations. Problems of grammatical
description  discussed  with  reference  to  two  specific  examples 
of  chess  notations  include: 


(1)  The  problem  of  describing  the  "basic 
regularities"  which  determine  how  the 
sentences  of  a  language  are  understood; 

(2)  The  problem  of  translating  between  two 
languages  with  the  same  "universe  of 
discourse"  but  with  different  ways  of 
referring  to  it; 

(3)  The  problem  of  explaining  the  intuitive 
notion  of  "grammaticality"  which  native 
users  of  a  language  possess. 

The  importance,  for  a  useful  language  description, 
of  describing  the  "semantic  interpretation"  process  is 
illustrated  and  emphasized,  and  the  value  of  language 
descriptions  in  the  form  of  generative  grammars  is 
questioned. 
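
The report does not say which two chess notations the paper examines.
Purely as an illustration of problem (2), the sketch below assumes the
English descriptive and the algebraic notations, which name the same 64
squares in different ways. The function names are hypothetical. The
point of the example is that a descriptive square name cannot be
translated without knowing which player is speaking, that is, without
interpreting it against the shared universe of discourse.

```python
# File names in descriptive notation mapped to algebraic file letters.
FILES = {
    "QR": "a", "QN": "b", "QB": "c", "Q": "d",
    "K": "e", "KB": "f", "KN": "g", "KR": "h",
}
FILES_REVERSED = {letter: name for name, letter in FILES.items()}


def descriptive_to_algebraic(square: str, side: str) -> str:
    """Translate e.g. 'K4' as named by 'white' or 'black' into algebraic form."""
    file_name, rank = square[:-1], int(square[-1])
    if file_name not in FILES or not 1 <= rank <= 8:
        raise ValueError(f"not a descriptive square: {square!r}")
    if side not in ("white", "black"):
        raise ValueError(f"unknown side: {side!r}")
    # Descriptive ranks are counted from the speaker's own side of the board.
    absolute_rank = rank if side == "white" else 9 - rank
    return FILES[file_name] + str(absolute_rank)


def algebraic_to_descriptive(square: str, side: str) -> str:
    """Translate e.g. 'e4' into the name used by 'white' or 'black'."""
    letter, rank = square[0], int(square[1:])
    if letter not in FILES_REVERSED or not 1 <= rank <= 8:
        raise ValueError(f"not an algebraic square: {square!r}")
    if side not in ("white", "black"):
        raise ValueError(f"unknown side: {side!r}")
    relative_rank = rank if side == "white" else 9 - rank
    return FILES_REVERSED[letter] + str(relative_rank)


# The same descriptive name refers to different squares for the two players.
print(descriptive_to_algebraic("K4", "white"))
print(descriptive_to_algebraic("K4", "black"))
```

A word-for-word substitution table between the two notations is
therefore not enough. The translation has to pass through the referent,
which is in keeping with the emphasis the abstract places on the
semantic interpretation process.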